update tooling chat template

#1
Technology Innovation Institute org
No description provided.

how i can use it with vllm?

Technology Innovation Institute org

Hello @suzii ,

pip install vllm

vllm serve tiiuae/Falcon-H1-1.5B-Instruct --tensor-parallel-size 2 --data-parallel-size 1

You can visit our official githubpage for more details about other inference frameworks.

DhiyaEddine changed pull request status to merged

hi @DhiyaEddine
sorry for the unclear question, I mean how to use model with vllm and use tools calling, I tried

vllm serve tiiuae/Falcon-H1-7B-Instruct --tensor-parallel-size 2 --enable-auto-tool-choice --tool-call-parser llama3_json --host 0.0.0.0 --port 8080 --chat-template falcon.jinaji

but it doesn't work

Technology Innovation Institute org

@suzii what is the error you get?

Technology Innovation Institute org

Note also that ee have today pushed a fix for the chat template with tool calling for all falcon H1 series of model, maybe that would help.

I have already tried on llama3_json and hermes parser (with "update tooling chat template52539b39" chat template). No error but it can not call any function.

Sign up or log in to comment